Task Allocation Strategy of Spatial Crowdsourcing Based on Deep Reinforcement Learning
NI Zhiwei1,2, LIU Hao1,2, ZHU Xuhui1,2, ZHAO Yang1,2, RAN Jiamin1,2
1. School of Management, Hefei University of Technology, Hefei 230009; 2. Key Laboratory of Process Optimization and Intelligent Decision-Making, Ministry of Education, Hefei University of Technology, Hefei 230009
Abstract: Traditional dynamic online task allocation strategies struggle to learn effectively from historical data and ignore the impact of current decisions on future revenue. Therefore, a task allocation strategy for spatial crowdsourcing based on deep reinforcement learning is proposed. Firstly, taking the maximization of long-term cumulative income as the objective function, the task assignment problem is modeled from the perspective of a single crowdsourcing worker as a Markov decision process, and it is transformed into solving the Q value of state-action pairs and finding a one-to-one matching between workers and tasks. Secondly, an improved deep reinforcement learning algorithm learns offline from historical task data to build a Q-value prediction model. Finally, during dynamic online allocation, the Q value produced by the model in real time serves as the edge weight of the KM (Kuhn-Munkres) algorithm, yielding an allocation that optimizes global cumulative returns. Comparative experiments on a real taxi trip dataset show that the proposed strategy increases long-term cumulative income when the number of workers is within a certain scale.
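The allocation step described above can be sketched in a minimal form: a learned model supplies a Q value for each worker-task pair, and those values become the edge weights of a maximum-weight bipartite matching. In this sketch the Q-value predictor is a hypothetical stub (in the paper it is a deep network trained offline), and the KM algorithm is replaced by a brute-force search over permutations, which returns the same optimal one-to-one assignment for small instances.

```python
# Illustrative sketch only: q_value is a stand-in for the paper's
# offline-trained Q-value prediction model, and the exhaustive search
# below substitutes for the KM (Kuhn-Munkres) matching algorithm.
from itertools import permutations

def q_value(worker, task):
    # Hypothetical scoring function: higher Q for workers closer to a
    # task, mimicking a value estimate of accepting that task now.
    return 10.0 - abs(worker["x"] - task["x"]) - abs(worker["y"] - task["y"])

def assign(workers, tasks):
    # Build the edge-weight matrix q[i][j] for worker i and task j.
    q = [[q_value(w, t) for t in tasks] for w in workers]
    best_total, best_match = float("-inf"), None
    # Brute-force stand-in for KM: try every one-to-one mapping of
    # workers to tasks and keep the matching with the largest total Q.
    for perm in permutations(range(len(tasks)), len(workers)):
        total = sum(q[i][j] for i, j in enumerate(perm))
        if total > best_total:
            best_total, best_match = total, list(enumerate(perm))
    return best_total, best_match

workers = [{"x": 0, "y": 0}, {"x": 5, "y": 5}]
tasks = [{"x": 1, "y": 0}, {"x": 5, "y": 4}]
total, match = assign(workers, tasks)
print(match)  # list of (worker_index, task_index) pairs in the optimal matching
```

In practice the O(n!) search would be replaced by the KM algorithm, which finds the same maximum-weight matching in polynomial time, as the strategy requires for online allocation at scale.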
NI Zhiwei, LIU Hao, ZHU Xuhui, ZHAO Yang, RAN Jiamin. Task Allocation Strategy of Spatial Crowdsourcing Based on Deep Reinforcement Learning. 模式识别与人工智能 (Pattern Recognition and Artificial Intelligence), 2021, 34(3): 191-205.